Cross-lingual transfer of abstractive summarizer to less-resource language
Abstract
Automatic text summarization extracts important information from texts and presents the information in the form of a summary. Abstractive summarization approaches progressed significantly by switching to deep neural networks, but results are not yet satisfactory, especially for languages where large training sets do not exist. In several natural language processing tasks, cross-lingual model transfer is successfully applied in less-resource languages. For summarization, cross-lingual model transfer was not attempted due to the non-reusable decoder side of neural models that cannot correct target language generation. In our work, we use a pre-trained English summarization model based on deep neural networks and sequence-to-sequence architecture to summarize Slovene news articles. We address the problem of the inadequate decoder by using an additional language model for the evaluation of the generated text in the target language. We test several cross-lingual summarization models with different amounts of target data for fine-tuning. We assess the models with automatic evaluation measures and conduct a small-scale human evaluation. Automatic evaluation shows that the summaries of our best cross-lingual model are useful and of quality similar to the model trained only in the target language. Human evaluation shows that our best model generates summaries with high accuracy and acceptable readability. However, similar to other abstractive models, our models are not perfect and may occasionally produce misleading or absurd content.
Similar resources
ABSUM: a Knowledge-Based Abstractive Summarizer
Abstractive summarization is one of the main goals of text summarization research, but also one of its greatest challenges. The authors of a recent literature review (Lloret and Palomar 2012) even conclude that “abstractive paradigms [...] will become one of the main challenges to solve” in text summarization. In building an abstractive summarization system, however, it is often hard to imagine where t...
Cross-Lingual Word Embeddings for Low-Resource Language Modeling
Most languages have no established writing system and minimal written records. However, textual data is essential for natural language processing, and particularly important for training language models to support speech recognition. Even in cases where text data is missing, there are some languages for which bilingual lexicons are available, since creating lexicons is a fundamental task of doc...
Cross-Lingual Lexico-Semantic Transfer in Language Learning
Lexico-semantic knowledge of our native language provides an initial foundation for second language learning. In this paper, we investigate whether and to what extent the lexico-semantic models of the native language (L1) are transferred to the second language (L2). Specifically, we focus on the problem of lexical choice and investigate it in the context of three typologically diverse languages...
Porting a Summarizer to the French Language
Abstract. In this article we present the adaptation of the automatic summarization tool REZIME to the French language. REZIME is a single-document automatic summarization tool intended for the medical domain that relies on statistical, syntactic, and lexical criteria to extract the most relevant sentences. In this article we describe the REZIME system as it was designed and the ...
Word Alignment and Cross-Lingual Resource Acquisition
Annotated corpora are valuable resources for developing Natural Language Processing applications. This work focuses on acquiring annotated data for multilingual processing applications. We present an annotation environment that supports a web-based user-interface for acquiring word alignments between English and Chinese as well as a visualization tool for researchers to explore the annotated data.
Journal
Journal title: Journal of Intelligent Information Systems
Year: 2021
ISSN: 1573-7675, 0925-9902
DOI: https://doi.org/10.1007/s10844-021-00663-8